Manual for mcclust.ext R package

Author: Wade Sara
Publication date: 14/05/2015

This R package provides post-processing tools for MCMC samples of partitions to summarize the posterior in Bayesian clustering models. Functions for point estimation are provided, giving a single representative clustering of the posterior. To characterize uncertainty in the point estimate, credible balls can also be computed.

Warwick Research Archives Portal Repository

Bayesian Cluster Analysis

Author: Wade Sara K
Publication date: 15/05/2023

Edinburgh Research Explorer

Ultra-fast Deep Mixtures of Gaussian Process Experts

Authors: Etienam Clement; Law Kody; Wade Sara
Publication date: 11/06/2020

Mixtures of experts have become an indispensable tool for flexible modelling in a supervised learning context, and sparse Gaussian processes (GPs) have shown promise as a leading candidate for the experts in such models. In the present article, we propose to design the gating network for selecting the experts from such mixtures of sparse GPs using a deep neural network (DNN). This combination provides a flexible, robust, and efficient model which is able to significantly outperform competing models. We furthermore consider efficient approaches to computing maximum a posteriori (MAP) estimators of these models by iteratively maximizing the distribution of experts given allocations and allocations given experts. We also show that a recently introduced method called Cluster-Classify-Regress (CCR) is capable of providing a good approximation of the optimal solution extremely quickly. This approximation can then be further refined with the iterative algorithm.

arXiv.org e-Print Archive

Leveraging variational autoencoders for multiple data imputation

Authors: Roskams-Hieter Breeshey; Wade Sara; Wells Jude
Publication date: 30/09/2022

Missing data persists as a major barrier to data analysis across numerous applications. Recently, deep generative models have been used for imputation of missing data, motivated by their ability to capture highly non-linear and complex relationships in the data. In this work, we investigate the ability of deep models, namely variational autoencoders (VAEs), to account for uncertainty in missing data through multiple imputation strategies. We find that VAEs provide poor empirical coverage of missing data, with underestimation and overconfident imputations, particularly for more extreme missing data values. To overcome this, we employ $\beta$-VAEs, which, viewed from a generalized Bayes framework, provide robustness to model misspecification. Assigning a good value of $\beta$ is critical for uncertainty calibration and we demonstrate how this can be achieved using cross-validation. In downstream tasks, we show how multiple imputation with $\beta$-VAEs can avoid false discoveries that arise as artefacts of imputation. Comment: 17 pages, 3 main figures, 6 supplementary figures

arXiv.org e-Print Archive

Pseudo-marginal Bayesian inference for Gaussian process latent variable models

Authors: Gadd C.; Shah A. A.; Wade Sara K
Publication venue: Springer Science and Business Media LLC
Publication date: 01/06/2021

A Bayesian inference framework for supervised Gaussian process latent variable models is introduced. The framework overcomes the high correlations between latent variables and hyperparameters by collapsing the statistical model through approximate integration of the latent variables. Using an unbiased pseudo estimate for the marginal likelihood, the exact hyperparameter posterior can then be explored using collapsed Gibbs sampling and, conditional on these samples, the exact latent posterior can be explored through elliptical slice sampling. The framework is tested on both simulated and real examples. When compared with the standard approach based on variational inference, this approach leads to significant improvements in the predictive accuracy and quantification of uncertainty, as well as a deeper insight into the challenges of performing inference in this class of models.

Edinburgh Research Explorer

Warwick Research Archives Portal Repository

Colombian Women’s Life Patterns: A Multivariate Density Regression Approach

Authors: Antoniano-Villalobos Isadora; Cremaschi Andrea; Piccarreta Raffaella; Wade Sara K
Publication venue: Institute of Mathematical Statistics
Publication date: 12/01/2021

Women in Colombia face difficulties related to the patriarchal traits of their societies and the well-known conflict afflicting the country since 1948. In this critical context, our aim is to study the relationship between baseline socio-demographic factors and variables associated with fertility, partnership patterns, and work activity. To best exploit the explanatory structure, we propose a Bayesian multivariate density regression model, which can accommodate mixed responses with censored, constrained, and binary traits. The flexible nature of the model allows for nonlinear regression functions and non-standard features in the errors, such as asymmetry or multi-modality. The model has interpretable covariate-dependent weights constructed through normalization, allowing for combinations of categorical and continuous covariates. Computational difficulties for inference are overcome through an adaptive truncation algorithm combining adaptive Metropolis-Hastings and sequential Monte Carlo to create a sequence of automatically truncated posterior mixtures. For our study on Colombian women's life patterns, a variety of quantities are visualised and described, and in particular, our findings highlight the detrimental impact of family violence on women's choices and behaviors. Comment: to appear in Bayesian Analysis

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della Ricerca - Bocconi

Edinburgh Research Explorer

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Bayesian Cluster Analysis: Point Estimation and Credible Balls (with Discussion)

Authors: Ghahramani Zoubin; Wade Sara
Publication venue: Institute of Mathematical Statistics
Publication date: 19/10/2017

Clustering is widely studied in statistics and machine learning, with applications in a variety of fields. As opposed to classical algorithms which return a single clustering solution, Bayesian nonparametric models provide a posterior over the entire space of partitions, allowing one to assess statistical properties, such as uncertainty on the number of clusters. However, an important problem is how to summarize the posterior; the huge dimension of partition space and difficulties in visualizing it add to this problem. In a Bayesian analysis, the posterior of a real-valued parameter of interest is often summarized by reporting a point estimate such as the posterior mean along with 95% credible intervals to characterize uncertainty. In this paper, we extend these ideas to develop appropriate point estimates and credible sets to summarize the posterior of clustering structure based on decision and information theoretic techniques.

arXiv.org e-Print Archive

Archivio Ricerca Ca'Foscari

Crossref

VU Research Portal

PubliCatt

Edinburgh Research Explorer

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Warwick Research Archives Portal Repository
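
As a rough illustration of the idea in the last abstract above (a point estimate plus a credible ball for a clustering), the following Python sketch may help. It is not the mcclust.ext API and not the authors' algorithm. Three things are assumptions made only for this example: Binder's pairwise disagreement count is used as the distance between partitions, the search for the point estimate is restricted to the sampled partitions, and the function names and toy MCMC samples are hypothetical.

```python
import math
from itertools import combinations


def binder_distance(a, b):
    """Count item pairs that are co-clustered in one partition but not the other."""
    return sum(
        (a[i] == a[j]) != (b[i] == b[j])
        for i, j in combinations(range(len(a)), 2)
    )


def point_estimate(samples):
    """Sampled partition with the smallest total (hence average) distance to all samples.

    Simplification: only partitions that appear in the samples are candidates.
    """
    return min(samples, key=lambda c: sum(binder_distance(c, s) for s in samples))


def credible_ball_radius(estimate, samples, level=0.95):
    """Smallest radius whose ball around `estimate` holds at least `level` of the samples."""
    dists = sorted(binder_distance(estimate, s) for s in samples)
    k = math.ceil(level * len(dists))  # number of samples the ball must contain
    return dists[k - 1]


# Hypothetical MCMC output: six sampled partitions of four items (cluster labels).
samples = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 2],
]

estimate = point_estimate(samples)
radius = credible_ball_radius(estimate, samples, level=0.95)
print(estimate, radius)  # prints: [0, 0, 1, 1] 3
```

Because the distance only checks whether pairs of items share a label, relabelling the clusters does not change the result, which is why no label-switching correction is needed in this sketch.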